ABSTRACT

We present a new method for the analysis of peculiar velocity surveys which
removes contributions to velocities from small scale, nonlinear velocity modes 
while retaining information about large scale motions. Our method utilizes 
Karhunen-Loeve methods of data compression to construct a set of moments 
out of the velocities which are minimally sensitive to small scale power. The set
of moments is then used in a likelihood analysis. We develop criteria for the
selection of moments, as well as a statistic to quantify the overall sensitivity of 
a set of moments to small scale power. Although we discuss our method in the 
context of peculiar velocity surveys, it may also prove useful in other situations 
where data filtering is required. 

Subject headings: cosmology: distance scales - cosmology: large scale structure of the 
universe - cosmology: observation - cosmology: theory - galaxies: kinematics and dynamics 
- galaxies: statistics 



1. INTRODUCTION 



Peculiar velocity surveys are an important tool for probing the mass distribution of the 
universe on large scales. In the analysis of these surveys, galaxies or clusters of galaxies are
assumed to be tracers of the matter velocity field, which in linear theory is directly related
to the density field. Thus peculiar velocity data can complement other measures of the 
mass distribution by placing constraints on the properties of the density field, for example, 
the power spectrum of fluctuations. Peculiar velocities also provide a powerful test of the 
gravitational instability theory of structure formation. 

In practice, the use of peculiar velocities to constrain properties of the density field 
is complicated by several factors. First and foremost is the fact that a direct relationship 
between velocity and density fields holds only in linear theory; this necessitates that we focus 
on large enough scales so that linearity can be reasonably assumed. This also requires that 
we can adequately separate large-scale contributions to the velocity field from small-scale, 
nonlinear contributions. 

One of the most straightforward methods of analyzing peculiar velocity data is to ex- 
amine the statistics of low-order moments of the velocity field, for example, the bulk flow 
(Lauer & Postman 1994; Riess, Press & Kirshner 1995). The idea here is that in calculating 
low-order moments the small scale modes will be averaged out, so that the values of these 
moments will reflect only large-scale motion. It has been shown, however, that the sparse- 
ness of peculiar velocity data can lead to small-scale modes making a significant contribution 
to low-order moments through incomplete cancellation (Feldman & Watkins 1994, 1998), an 
effect which up to now has not been quantified. Another drawback of this approach is that 
it utilizes only a fraction of the available information. 

An alternative method is to perform a likelihood analysis using all of the velocity infor- 
mation (Jaffe & Kaiser 1995). An obvious danger here is that retaining small-scale, nonlinear 
contributions to the velocities can lead to unpredictable biases which can skew the results 
(Croft & Efstathiou 1994). This method also has the disadvantage of becoming unwieldy for
surveys larger than about a thousand objects. While advances in computing will make this
less of a problem in the future, clearly a less time-intensive method is desirable. 

In this paper we describe a new method for the analysis of peculiar velocities which 
is designed to separate large and small scale velocity information in an optimal way. The 
method utilizes Karhunen-Loeve methods of data compression to construct a set of moments
out of the velocities which are minimally sensitive to small scale power; these moments can 
then be used in a likelihood analysis. Overall sensitivity of the set of moments to small 
scales is quantified, and can be controlled through the number of moments retained in the 
analysis. Since the number of moments kept is typically much smaller than the number of 
velocities in the survey, this method has the added advantage of being much more efficient 
than a full analysis of the data. 

Karhunen-Loeve methods (Kenney & Keeping 1954; Kendall & Stuart 1969) have recently
become popular in cosmology. A general discussion of their use in the analysis of large
data sets was given by Tegmark, Taylor & Heavens (1997). In addition, Karhunen-Loeve
methods have been applied to the Las Campanas Redshift Survey (Matsubara, Szalay & 
Landy 2000), to velocity field surveys (Hoffman & Zaroubi 2000; Silberman et al. 2001), 
and to the decorrelation of the power spectrum (Hamilton 2000; Hamilton & Tegmark 2000). 
Although we use the same general method, our take on the formalism is quite different. Tak- 
ing advantage of the compression techniques and the Fisher information matrix (Fisher 1935), 
we filter out small-scale, nonlinear velocity modes and retain only information regarding the 
large-scale modes. 

This paper is organized as follows. In Sec. 2 we review likelihood methods for the
analysis of peculiar velocities. In Sec. 3 we discuss methods of data compression. In Sec. 4,
we describe criteria for the selection of a set of optimal moments. In Sec. 5 we describe
the power spectrum model that will be used for our analysis, and in Sec. 6 we discuss the
application of our method to peculiar velocity data. In Sec. 7 we show results from performing
our analysis on simulated catalogs that illustrate the effects of small-scale, nonlinear power 
and the effectiveness of our method of analysis in removing these effects. Finally, in Sec. 8 
we summarize and discuss our results. 

2. LIKELIHOOD METHODS FOR PECULIAR VELOCITIES 

Several studies have used likelihood methods for the analysis of peculiar velocity data
(see, e.g., Kaiser 1988). Here we review the most straightforward analysis of this type, one
that works directly with the observed line-of-sight peculiar velocities. Suppose that we are
given a set of objects with positions $\mathbf{r}_i$ and line-of-sight peculiar velocities $v_i$. We assume
that the observed velocity $v_i$ is of the form

$$v_i = \mathbf{v}(\mathbf{r}_i)\cdot\hat{\mathbf{r}}_i + \delta_i \qquad (1)$$

where $\mathbf{v}(\mathbf{r}_i)$ is the fully three-dimensional linear velocity field and $\delta_i$ is a Gaussian random
variable accounting for the deviation of a galaxy's measured velocity from the predictions
of linear theory. We shall model $\delta_i$ as having variance $\sigma_i^2 + \sigma_*^2$, where $\sigma_i$ is the particular
observational error associated with the $i$th object and $\sigma_*$ accounts for contributions to the
velocities of all of the galaxies in the survey arising from nonlinear effects as well as from the
components of the velocity field that have been neglected in the linear model (Kaiser 1988).
With these assumptions, the covariance matrix $R_{ij} = \langle v_i v_j \rangle$ takes the form

$$R_{ij} = R^{(v)}_{ij} + \delta_{ij}\,(\sigma_i^2 + \sigma_*^2) \qquad (2)$$

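where $R^{(v)}_{ij} = \langle \mathbf{v}(\mathbf{r}_i)\cdot\hat{\mathbf{r}}_i \;\, \mathbf{v}(\mathbf{r}_j)\cdot\hat{\mathbf{r}}_j \rangle$. In linear theory, the covariance matrix $R^{(v)}_{ij}$ can be written
as an integral over the density power spectrum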

$$R^{(v)}_{ij} = \frac{1}{(2\pi)^3}\int P_{(v)}(k)\, W^2_{ij}(\mathbf{k})\, d^3k = \frac{H^2\Omega^{1.2}}{2\pi^2}\int P(k)\, W^2_{ij}(k)\, dk \qquad (3)$$

where $P_{(v)}(k)$ is the velocity power spectrum and $W^2_{ij}(k)$ is a tensor window function calculated from the positions and velocity errors
of the objects in the survey (for more details see Feldman & Watkins (1994); Watkins &
Feldman (1995)). The derivation of this formula as well as the definition of $W^2_{ij}(k)$ is given
in appendix A.
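
As a minimal sketch of Eq. (2), the full covariance matrix can be assembled once the signal part $R^{(v)}_{ij}$ is available. The sketch assumes that $R^{(v)}_{ij}$ has already been computed from the window functions (that step is not shown), and the function and argument names are illustrative rather than taken from the analysis itself.

```python
def total_covariance(r_signal, sigma_obs, sigma_star):
    """Return R_ij = R^(v)_ij + delta_ij (sigma_i^2 + sigma_*^2), as in Eq. (2).

    r_signal   : N x N nested list, assumed precomputed signal covariance R^(v)
    sigma_obs  : length-N list of per-object observational errors sigma_i
    sigma_star : scalar dispersion sigma_* shared by all objects
    """
    n = len(sigma_obs)
    if len(r_signal) != n or any(len(row) != n for row in r_signal):
        raise ValueError("r_signal must be N x N with N = len(sigma_obs)")
    # Copy the signal part, then add the noise variance on the diagonal only.
    r_total = [list(row) for row in r_signal]
    for i in range(n):
        r_total[i][i] += sigma_obs[i] ** 2 + sigma_star ** 2
    return r_total
```

Only the diagonal changes, since the noise terms $\delta_i$ of different objects are taken to be independent.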

Given the covariance matrix, we can construct the probability distribution for the line- 
of-sight peculiar velocities 

$$L(v_1,\ldots,v_N; P(k)) = \sqrt{|R^{-1}|}\,\exp\left(\sum_{i,j=1}^{N} -v_i R^{-1}_{ij} v_j/2\right) \qquad (4)$$

Alternatively, if we are given a set of velocities $(v_1,\ldots,v_N)$, we can view $L(v_1,\ldots,v_N; P(k))$
as the likelihood functional for the power spectrum $P(k)$. Typically, the power spectrum
is parameterized by a parameter vector $\Theta = (\theta_1,\ldots,\theta_m)$; then $L(v_1,\ldots,v_N; \Theta)$ becomes a
likelihood function for the parameters $\Theta$. The value of the parameter vector that maximizes
the likelihood is known as the maximum likelihood estimator, $\Theta_{ML}$.
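
In practice it is the logarithm of Eq. (4) that one evaluates. The following is a minimal sketch using NumPy, assuming the covariance matrix `R` has already been built for the trial power spectrum; the function name is illustrative. A Cholesky factorization is used here as one numerically stable way to obtain both the determinant and the quadratic form.

```python
import numpy as np

def log_likelihood(v, R):
    """ln L = -0.5 * ln|R| - 0.5 * v^T R^{-1} v, the logarithm of Eq. (4)."""
    v = np.asarray(v, dtype=float)
    R = np.asarray(R, dtype=float)
    # Cholesky factor R = L L^T; raises LinAlgError if R is not positive definite.
    L = np.linalg.cholesky(R)
    # Solve L y = v, so that v^T R^{-1} v = y^T y.
    y = np.linalg.solve(L, v)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * log_det - 0.5 * float(y @ y)
```

Maximizing this function over a grid of power spectrum parameters would give the maximum likelihood estimate described above.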

Suppose that the true values of the parameters are given by $\Theta = \Theta_0$. The maximum
likelihood estimate $\Theta_{ML}$ will vary over different realizations of the line-of-sight velocities
$v_1,\ldots,v_N$; we can characterize this variation with the means $\langle(\theta_{ML})_i\rangle$ and the variances
$\Delta(\theta_{ML})_i^2 = \langle(\theta_{ML})_i^2\rangle - \langle(\theta_{ML})_i\rangle^2$. It has been shown (Kendall & Stuart 1969) that in the
limit of a large number of objects, i.e. $N \to \infty$, the maximum likelihood estimator is the
best possible estimator of $\Theta_0$ in that it is unbiased, $\langle\Theta_{ML}\rangle = \Theta_0$, and has the minimum
possible variances. These minimum variances are given by $\Delta(\theta_{ML})_i^2 = 1/F_{ii}$, where $F_{ii}$ are
the diagonal elements of the Fisher information matrix, defined by

$$F_{ij} = -\left\langle \frac{\partial^2 \ln L}{\partial\theta_i\,\partial\theta_j} \right\rangle \qquad (5)$$

evaluated at $\Theta = \Theta_0$. Having an estimator that is unbiased and whose variances are characterized
in terms of the Fisher matrix simplifies our analysis considerably. For the remainder
of this paper we shall assume that the large $N$ limit applies.
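
For zero-mean Gaussian data, the definition in Eq. (5) reduces to the standard trace formula $F_{ab} = \frac{1}{2}\mathrm{Tr}[R^{-1}(\partial R/\partial\theta_a)R^{-1}(\partial R/\partial\theta_b)]$ (see, e.g., Tegmark, Taylor & Heavens 1997). The sketch below evaluates it with NumPy; it assumes the derivative matrices are supplied by the caller, and the function name is illustrative.

```python
import numpy as np

def fisher_matrix(R, dR_list):
    """F_ab = 0.5 * Tr[R^{-1} dR_a R^{-1} dR_b] for zero-mean Gaussian data."""
    R = np.asarray(R, dtype=float)
    # A_a = R^{-1} dR_a, obtained by solving R A_a = dR_a rather than inverting R.
    A = [np.linalg.solve(R, np.asarray(dR, dtype=float)) for dR in dR_list]
    m = len(A)
    F = np.empty((m, m))
    for a in range(m):
        for b in range(m):
            F[a, b] = 0.5 * np.trace(A[a] @ A[b])
    return F
```

For a single datum with variance $R = \theta$, this gives $F = 1/(2\theta^2)$, the familiar result for estimating a variance from one Gaussian sample.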



The result that the maximum likelihood estimator $\Theta_{ML}$ is unbiased, however, also
assumes that the velocity field is Gaussian and that the power spectrum can be well described
by the given parameterization, usually one derived from linear theory. The collapse of
small-scale, nonlinear density perturbations can cause both of these assumptions to be violated,
and can result in $\Theta_{ML}$ being biased in an unpredictable way (Croft & Efstathiou 1994).
In order to recover an unbiased estimator, we shall utilize methods of data compression to
filter out information about small-scale nonlinear velocities while retaining information about
large scales where the linear and Gaussian approximations should remain valid. While these
methods are typically used to reduce the size of an unwieldy data set without the loss of
information, here we are instead interested in using data compression as a filter of unwanted
information.

Given the difficulty in treating the general case, in the following we retain the model of
a Gaussian velocity field. The assumption here is that the primary effect of the collapse of
nonlinear perturbations is the modification of the power spectrum on small scales and that
departures from Gaussianity are small enough not to affect our analysis. We will return to
this issue in Sec. 7 and Sec. 8.



3. DATA COMPRESSION

For a given set of velocity data, the simplest form of data compression involves replacing the
original line-of-sight velocities $(v_1,\ldots,v_N)$ with $N'$ moments $(u_1,\ldots,u_{N'})$, where $N' <
N$ (for a more detailed discussion of data compression see Tegmark, Taylor & Heavens
(1997)). In this paper we will concentrate on linear data compression, where the moments
can, in general, be written as linear combinations of the velocities:

$$u_i = \sum_{j=1}^{N} B_{ij} v_j$$

where $B_{ij}$ is a constant $N' \times N$ matrix. If the number of moments $N'$ is less than $N$, then
replacing the $v_i$ with the $u_i$ will necessarily lead to a loss of information. However, by a
proper choice of the matrix $B_{ij}$, we can arrange it so that the information lost is primarily
associated with scales where nonlinear effects are likely to have caused deviations from linear
theory. Thus the process of data compression can be used to produce a set of moments which
are much less sensitive to nonlinear effects than the original line-of-sight velocities but that
still retain the desired information about large scale power.

For simplicity, consider a model for the power spectrum in which the power on nonlinear
scales is proportional to a single parameter $\theta_q$ (we will discuss a specific model of a
power spectrum of this type below). Given a set of line-of-sight velocities $v_1 \ldots v_N$, we
can determine the value of $\theta_q$ within a minimum variance of $\Delta\theta_q^2 = 1/F_{qq}$, where $F_{qq}$ is the
$qq$th element of the Fisher matrix (Eq. 5) as discussed above. The variance $\Delta\theta_q^2$ is thus a
measure of how sensitive the data set is to nonlinear scales; the larger the variance, the less
small-scale information the data contains.

Now, suppose that we compress all of the velocity information into a single moment,
$u = \sum_i b_i v_i$, where $b_i$ is a $1 \times N$ set of coefficients. The $1 \times 1$ covariance matrix for $u$ will
be given by

$$\tilde{R} = \langle u^2 \rangle = \sum_{i,j} b_i R_{ij} b_j$$

From the definition of the Fisher matrix (Eq. 5) and the likelihood (Eq. 4), the $1 \times 1$ Fisher
matrix for the compressed data takes the form

$$F_{qq} = \frac{1}{2}\left[\frac{\sum_{i,j} b_i\,(\partial R_{ij}/\partial\theta_q)\,b_j}{\sum_{i,j} b_i R_{ij} b_j}\right]^2 \qquad (6)$$

It is convenient to normalize $b_i$ so that the moment $u$ has unit variance, $\tilde{R} = \sum_{i,j} b_i R_{ij} b_j =
1$; with this normalization we can write

$$F_{qq} = \frac{1}{2}\left[\sum_{i,j} b_i \frac{\partial R_{ij}}{\partial\theta_q} b_j\right]^2$$

Note that this normalization will hold only for a particular covariance matrix $R_{ij}$, and
hence for a particular choice of parameters (see Sec. 6).

Since the variance $\Delta\theta_q^2$ is inversely proportional to $F_{qq}$, we can find the single moment
that carries the minimum information about $\theta_q$ by finding the $b_i$ which minimizes the quantity
on the right hand side, subject to the normalization constraint. This problem is solved by
introducing a Lagrange multiplier $\lambda$ and extremizing the quantity

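$$\sum_{i,j}\left[ b_i \frac{\partial R_{ij}}{\partial\theta_q} b_j - \lambda\, b_i R_{ij} b_j \right] \qquad (9)$$

with respect to $b_i$, resulting in the equation

$$\sum_{j=1}^{N}\left(\frac{\partial R_{ij}}{\partial\theta_q}\right) b_j = \lambda \sum_{j=1}^{N} R_{ij} b_j \qquad (10)$$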

The equation above clearly represents an eigenvalue problem after multiplying by $R^{-1}$
from the left; however, the matrix for this eigenvalue problem is not normal. Since the
covariance matrix $R_{ij}$ is symmetric and positive definite, it can be Cholesky decomposed
(see, e.g., Press et al. 1992) so that $R_{ij} = \sum_{p=1}^{N} L_{ip} L_{jp}$ for some invertible matrix $L_{ij}$.
Plugging this into the equation above, multiplying both sides by $L^{-1}_{ki}$ and summing over $i$
we obtain an eigenvalue problem

$$\sum_{i,j,m} L^{-1}_{ki}\,\frac{\partial R_{ij}}{\partial\theta_q}\,L^{-1}_{mj}\left(\sum_{l} L_{lm} b_l\right) = \lambda \left(\sum_{l} L_{lk} b_l\right) \qquad (11)$$

for the real, symmetric matrix $L^{-1}\,(\partial R/\partial\theta_q)\,(L^{-1})^T$. Solving this eigenvalue problem gives
us a set of $N$ orthogonal eigenvectors $\sum_j L_{ji}(b_n)_j$ with corresponding eigenvalues $\lambda_n$. Each
eigenvector has a corresponding moment $u_n = \sum_i (b_n)_i v_i$. The eigenvalue $\lambda_n$ of a moment
$u_n$ is related to the error bar $\Delta\theta_q$ that one could place on $\theta_q$ after compressing the velocities
into that single moment, as can be seen by manipulating the equations above:

$$\frac{1}{\Delta\theta_q^2} = F_{qq} = \frac{1}{2}\left[\sum_{i,j}(b_n)_i \frac{\partial R_{ij}}{\partial\theta_q}(b_n)_j\right]^2 = \frac{1}{2}\left[\lambda_n \sum_{i,j}(b_n)_i R_{ij}(b_n)_j\right]^2 = \frac{\lambda_n^2}{2} \qquad (12)$$

so that $\Delta\theta_q = \sqrt{2}/|\lambda_n|$. Thus the moment with the largest $|\lambda|$ is the moment that carries the
maximum possible amount of information about the parameter $\theta_q$.
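
The Cholesky route to the symmetric eigenvalue problem can be sketched numerically as follows. This is an illustration using NumPy, assuming `R` and its derivative `dR` with respect to $\theta_q$ are already available as symmetric arrays; the function name and the ordering convention (increasing $|\lambda_n|$) are choices made for this sketch.

```python
import numpy as np

def optimal_moments(R, dR):
    """Solve dR b = lambda R b via Cholesky, returning (lambdas, B).

    Rows of B are the coefficient vectors b_n, ordered by increasing |lambda_n|
    and normalized so that B R B^T is the identity.
    """
    R = np.asarray(R, dtype=float)
    dR = np.asarray(dR, dtype=float)
    L = np.linalg.cholesky(R)                 # R = L L^T
    X = np.linalg.solve(L, dR)                # X = L^{-1} dR
    M = np.linalg.solve(L, X.T).T             # M = L^{-1} dR L^{-T}
    M = 0.5 * (M + M.T)                       # remove round-off asymmetry
    lam, C = np.linalg.eigh(M)                # columns of C are c_n = L^T b_n
    order = np.argsort(np.abs(lam))
    lam, C = lam[order], C[:, order]
    B = np.linalg.solve(L.T, C).T             # b_n = L^{-T} c_n, stored as rows
    return lam, B
```

With this output, compressing the data to the $N'$ moments of smallest $|\lambda_n|$ amounts to keeping the first $N'$ rows, `B[:n_keep]`.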

The moments $u_n$ are statistically uncorrelated and of unit variance:

$$\langle u_n u_m \rangle = \sum_{i,j} \langle (b_n)_i v_i (b_m)_j v_j \rangle = \sum_{i,j} (b_n)_i R_{ij} (b_m)_j = \sum_{i,p,j} (b_n)_i L_{ip} L_{jp} (b_m)_j = \delta_{nm}. \qquad (13)$$

Since we are assuming that the $v_i$, and thus the $u_n$, are Gaussian random variables, this
implies that the $u_n$ are statistically independent, so that there is no overlap of information
among the $u_n$. This suggests that if we convert the velocities $v_1 \ldots v_N$ into the moments
$u_n = \sum_i (b_n)_i v_i = \sum_i B_{ni} v_i$, there will be no loss of information, and the transformation
matrix $B_{ni} = (b_n)_i$ will necessarily be invertible. Further, the statistical independence of the
$u_n$ ensures that if we compress the data by leaving out selected moments, i.e. by keeping
only selected rows of $B_{nj}$, the information contained by those moments will be completely
removed from the data. The question becomes, then, how to select which moments to leave
out.

4. MOMENT SELECTION 

If we order the moments $u_n$ in order of increasing eigenvalue,

$$|\lambda_1| < |\lambda_2| < \ldots < |\lambda_N| \qquad (14)$$

then we can interpret each moment as carrying successively more information about $\theta_q$, with
$u_N$ carrying the maximum possible amount of information. Since our goal is to produce
a data set that is less sensitive to the value of $\theta_q$ than the original data, we should keep
moments only up to some $N'$, thus discarding the moments that carry the most information
about $\theta_q$. However, we would also like to keep as many moments as possible in order to
retain the maximum information about large scales.

In order to choose a value of $N'$, we need to examine what error bar $\Delta\theta_q$ we can put on
the parameter $\theta_q$ using the compressed data. Since the moments are independent, we can
write the Fisher matrix for the $N'$ moments that were not discarded as

$$F_{qq} = \frac{1}{2}\sum_{n=1}^{N'}\left[\sum_{i,j}(b_n)_i \frac{\partial R_{ij}}{\partial\theta_q}(b_n)_j\right]^2 = \frac{1}{2}\sum_{n=1}^{N'} \lambda_n^2 \qquad (15)$$

so that the error bar that can be put on $\theta_q$ using the compressed data is given by

$$\Delta\theta_q = \left(\frac{1}{2}\sum_{n=1}^{N'} \lambda_n^2\right)^{-1/2} \qquad (16)$$

This result suggests that the number of moments kept, $N'$, should be chosen by adding up
the squares of the smallest eigenvalues until the desired sensitivity is reached.
For the purposes of this paper, we shall adopt the following criterion: First, we estimate
the true size of the parameter $\theta_q = \theta_{q0}$ from actual peculiar velocity data. Then, we keep
the largest number $N'$ of moments that is still consistent with the requirement that $\Delta\theta_q > \theta_{q0}$.
With this requirement, as long as our estimate of the true value of $\theta_q$ is correct, our final
set of moments $u_1 \ldots u_{N'}$ will not contain enough information to distinguish the value of $\theta_q$
from zero.
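
The selection criterion can be sketched in a few lines of Python. The sketch assumes the eigenvalues $\lambda_n$ are already known and uses the relation $\Delta\theta_q = (\frac{1}{2}\sum_n \lambda_n^2)^{-1/2}$; the function name and the default $\theta_{q0} = 1$ are illustrative choices.

```python
import math

def choose_num_moments(lambdas, theta_q0=1.0):
    """Largest N' with Delta theta_q = sqrt(2 / sum_{n<=N'} lambda_n^2) > theta_q0."""
    ordered = sorted(abs(x) for x in lambdas)   # increasing |lambda|
    total = 0.0
    n_keep = 0
    for lam in ordered:
        total += lam * lam
        # With total == 0 the kept moments carry no information on theta_q.
        if total > 0.0 and math.sqrt(2.0 / total) <= theta_q0:
            break
        n_keep += 1
    return n_keep
```

For example, with $|\lambda_n| = 0.5, 1, 1, 2$ and $\theta_{q0} = 1$, the running error bar is about 2.83, 1.26 and 0.94 after one, two and three moments, so two moments would be kept.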

5. POWER SPECTRUM MODEL 

In order to carry out the program outlined above, we need to construct a model of the
power spectrum such that the power on nonlinear scales is proportional to a single parameter
$\theta_q$. To this end, we adopt a model for the power spectrum of the form

$$P(k) = P_l(k) + \theta_q P_{nl}(k), \qquad (17)$$

where $P_l(k) = 0$ for $k > k_{nl}$ and $P_{nl}(k) = 0$ for $k < k_{nl}$. Here $k_{nl}$ is taken to be the
wavenumber of the largest scale for which density perturbations have become nonlinear.
Traditionally, the scale of nonlinearity is related to the radius of a top hat filter that results
in a density contrast of unity. This corresponds to a wavenumber $k_{nl} = 2/r_T = 0.25\,h\,\mathrm{Mpc}^{-1}$,
using the value $r_T = 8\,h^{-1}\,\mathrm{Mpc}$ (Melott & Shandarin 1993). We prefer to use the smaller
value of $k_{nl} = 0.2\,h\,\mathrm{Mpc}^{-1}$ in order to ensure a better separation of linear and nonlinear
information.

For $P_l(k)$, we use the BBKS parameterization of the power spectrum (Bardeen et al.
1986)

$$P_l(k) = \sigma_8^2\, C\, k\,\{1 + [6.4(k/\Gamma) + 3(k/\Gamma)^{1.5} + (1.7k/\Gamma)^2]^{1.13}\}^{-2/1.13} \qquad (18)$$

where $\Gamma$ parameterizes the "shape" of the power spectrum and the overall normalization
is determined by $\sigma_8$, the standard deviation of density fluctuations on a scale of $8\,h^{-1}\,\mathrm{Mpc}$.
The constant $C$ is determined via the direct relation between $\sigma_8$ and the power spectrum.
For models where the total density parameter $\Omega = 1$, the shape parameter is related to the
density of matter, $\Gamma = \Omega_m h$; however, we typically choose to ignore this relationship and
treat $\Gamma$ as a free parameter which we vary independently.
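
The shape of the linear spectrum in Eq. (18) is easy to evaluate directly. The sketch below transcribes the expression as written above and deliberately omits the amplitude $\sigma_8^2 C$, since fixing $C$ requires the $\sigma_8$ normalization integral, which is not reproduced here; the function name is illustrative.

```python
def bbks_shape(k, gamma):
    """Unnormalized linear spectrum of Eq. (18): P_l(k) / (sigma_8^2 C).

    k is in h/Mpc and gamma is the shape parameter; the amplitude
    sigma_8^2 C is deliberately left out of this sketch.
    """
    q = k / gamma
    bracket = 6.4 * q + 3.0 * q ** 1.5 + (1.7 * q) ** 2
    return k * (1.0 + bracket ** 1.13) ** (-2.0 / 1.13)
```

On very large scales ($k \to 0$) the function approaches $k$, so the spectrum is proportional to $k$ there, while at large $k$ it falls off steeply.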

Since we are interested in reducing the sensitivity of the data to the full range of
nonlinear scales, we take $P_{nl}(k)$ to be constant in the range of interest, $P_{nl}(k) = P_0$ for
$k_{nl} < k < k_c$. While the nonlinear power spectrum would be more accurately approximated
by $P_{nl}(k) \propto k^{-1}$ (Klypin & Melott 1992), this choice tends to emphasize the importance of
scales with wavenumber just larger than $k_{nl}$, whereas we prefer to weight all nonlinear scales
equally. Our choice of constant $P_{nl}(k)$ also has the benefit of simplifying our calculations
considerably. We have introduced a largest wavenumber $k_c$ for velocity modes to account for
the fact that modes smaller than the scale of perturbations out of which a galaxy or cluster
forms will not contribute to its velocity. For the analysis of galaxy data we have adopted a
value of $k_c = 6.0\,h\,\mathrm{Mpc}^{-1}$, although the results are fairly insensitive to the exact value chosen.
For simulated data, it is important to consider the dynamic range of the simulation when
choosing the value of $k_c$. The constant $P_0$ is set by the requirement that the contribution
of nonlinear scales to the line-of-sight velocity dispersion, which is what we have called $\sigma_*$
above, should be equal to the value estimated from the actual velocity data. Thus our "true"
value for $\theta_q$ will be $\theta_{q0} = 1$. We can express $P_0$ in terms of $\sigma_*$ by noting the relationship
between the velocity power spectrum and the velocity dispersion:

$$\sigma_*^2 = \frac{H^2\Omega^{1.2}}{6\pi^2}\int_{k_{nl}}^{k_c} P_{nl}(k)\, dk \qquad (19)$$

Since we have taken $P_{nl}$ to be constant over the given interval in $k$, the integral is trivial,
giving the relationship

$$P_0 = \frac{6\pi^2 \sigma_*^2}{H^2\Omega^{1.2}\,(k_c - k_{nl})} \qquad (20)$$

The value of $\sigma_*$ used will depend on what type of data is being analyzed as well as the
value of $k_{nl}$. For velocity data drawn from a simulation, the value of $\sigma_*^2$ can be calculated
directly by subtracting the contribution to the theoretical dispersion found using $P_l(k)$ from
the measured velocity dispersion of objects in the simulated catalog. For real velocity data,
one can obtain guidance for the choice of $\sigma_*$ from an analysis of the RMS peculiar velocity
of the survey (Watkins 1997).



6. ANALYSIS

In this section we describe how to apply the formalism outlined above in order to analyze
a peculiar velocity survey consisting of line-of-sight peculiar velocities $v_i$ and positions $\mathbf{r}_i$
for a set of $N$ galaxies. We first discuss the construction of the set of moments $u_1 \ldots u_{N'}$
that are insensitive to nonlinear scales. We then show how these moments can be used to
evaluate the likelihood of particular cosmological models.

Our first step is to construct the covariance matrix $R_{ij}$ for the velocities using Eqs. (1-3).
However, in order to do this we must specify values for the parameters in the power
spectrum $P(k)$. For the purposes of finding the moments $u_1 \ldots u_{N'}$, we choose $\Gamma = 0.35$
and $\beta = \Omega^{0.6}\sigma_8 = 0.5$, the values that we consider to be the "best fit" to a wide variety of
experimental data. While this choice of parameters will affect the specific choice of moments,
it should not have a significant effect on the likelihood results. As for the choice of $\theta_q$, for
simplicity we will take $\theta_q = 0$. Since we seek moments that cannot distinguish between the
"true" value $\theta_q = 1$ and $\theta_q = 0$, this should also not affect our results. Once $R_{ij}$ is calculated,
we Cholesky decompose it by finding the invertible matrix $L_{ij}$ such that $R_{ij} = \sum_{p=1}^{N} L_{ip} L_{jp}$.

The derivative $\partial R_{ij}/\partial\theta_q$ is also required; for our model of the power spectrum it can be
calculated in an identical way to $R_{ij}$ except that we must replace $P(k)$ with $P_{nl}(k)$. Here
the only parameter that must be set is the amplitude $P_0$ of the nonlinear power spectrum
discussed in Sec. 5 above.

The next step is to diagonalize the matrix $\sum_{i,j} L^{-1}_{ki}\,(\partial R_{ij}/\partial\theta_q)\,L^{-1}_{lj}$. This gives us a set of
eigenvalues $\lambda_n$ and eigenvectors $\sum_j L_{ji}(b_n)_j$. The moment coefficients $(b_n)_k$ are obtained
from the eigenvectors by multiplying by $L^{-1}_{ik}$ and summing over $i$. We order the eigenvalues
in order of increasing magnitude, $|\lambda_1| < |\lambda_2| < \ldots < |\lambda_N|$, and discard the moments with
$n > N'$, where $N'$ is determined by the condition that

$$\Delta\theta_q = \left(\frac{1}{2}\sum_{n=1}^{N'} \lambda_n^2\right)^{-1/2} > \theta_{q0} = 1 \qquad (21)$$

as discussed in Sec. 4.

The moments $u_n$ are derived for a single choice of power spectrum parameters; however,
we would like to use them to calculate the likelihood of power spectrum models with a
range of parameter values. Since the moments are just linear combinations of the line-of-sight
velocities, we can in principle use them to evaluate the likelihood of any given power
spectrum model as long as we can calculate the covariance matrix for the moments in the
model. It is important to note, though, that the moments will in general not be statistically
uncorrelated or have unit variance for any choice of parameters except those from which
they were derived.

Given a model for the power spectrum $P(k)$, the covariance matrix $\tilde{R}_{nm}$ for the moments
$u_n$ can be calculated from the covariance matrix $R_{ij}$ for the velocities $v_i$ and positions $\mathbf{r}_i$,

$$\tilde{R}_{nm} = \langle u_n u_m \rangle = \langle (b_n)_i v_i (b_m)_j v_j \rangle = (b_n)_i \langle v_i v_j \rangle (b_m)_j = (b_n)_i R_{ij} (b_m)_j \qquad (22)$$

The likelihood for the given power spectrum is then given as usual by

$$L(u_1,\ldots,u_{N'}; P(k)) = \sqrt{|\tilde{R}^{-1}|}\,\exp\left(-u_n \tilde{R}^{-1}_{nm} u_m/2\right) \qquad (23)$$

where the values of the moments are calculated from the velocities $v_i$ through $u_n = (b_n)_i v_i$
where we sum over repeated indices.
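
Eqs. (22) and (23) translate into a short routine. The following NumPy sketch assumes that the retained coefficient vectors are stored as the rows of an array `B` and that `R` is the velocity covariance for the trial power spectrum; the function name is illustrative.

```python
import numpy as np

def compressed_log_likelihood(v, R, B):
    """ln L for the moments u = B v, with covariance B R B^T (Eqs. 22-23)."""
    v = np.asarray(v, dtype=float)
    R = np.asarray(R, dtype=float)
    B = np.asarray(B, dtype=float)      # shape (N', N): retained rows b_n
    u = B @ v                           # moment values u_n = (b_n)_i v_i
    R_u = B @ R @ B.T                   # Eq. (22)
    L = np.linalg.cholesky(R_u)
    y = np.linalg.solve(L, u)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * log_det - 0.5 * float(y @ y)   # logarithm of Eq. (23)
```

Because `B` is held fixed while `R` changes with the trial parameters, the compressed covariance is in general neither diagonal nor of unit variance away from the fiducial model, which is why the full $N' \times N'$ matrix is factorized here.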

By calculating the likelihood over a range of power spectrum parameters, the maximum 
likelihood values of the parameters can be determined. If these values differ substantially 
from the initial "guess" used to calculate the moments, then the moments should be recalculated
using the maximum likelihood values. This process can be iterated until the maximum
likelihood values are close to the initial "guess"; this ensures that the moments are optimum
near the peak of the likelihood function. 

Since our goal has been to reduce the sensitivity of our data to nonlinear velocities, it
is of interest to examine the contributions to the individual moments $u_n$ from different scale
modes. By substituting Eq. (2) in Eq. (22), the variance for a given moment $u_n$ can be
written as

$$\langle u_n u_n \rangle = \tilde{R}_{nn} = (b_n)_i R_{ij} (b_n)_j = (b_n)_i R^{(v)}_{ij} (b_n)_j + (b_n)_i (b_n)_i (\sigma_i^2 + \sigma_*^2). \qquad (24)$$

The second term is the contribution to the moment due to noise. The first term is the
contribution to the moment from the velocity field; this term can be further expanded as

$$(b_n)_i R^{(v)}_{ij} (b_n)_j = \frac{H^2\Omega^{1.2}}{2\pi^2}\int P(k)\,(b_n)_i W^2_{ij}(k) (b_n)_j\, dk = \frac{H^2\Omega^{1.2}}{2\pi^2}\int P(k)\, W^2_n(k)\, dk \qquad (25)$$

where we have defined the window function for the $n$th moment as

$$W^2_n(k) = (b_n)_i W^2_{ij}(k) (b_n)_j, \qquad (26)$$

where $W^2_{ij}(k)$ is given in Eq. (A7) in the appendix. The window function $W^2_n(k)$ tells us the
sensitivity of the moment $u_n$ to the scale corresponding to the wave number $k$. This gives us
a check on our method; ideally, the moments that we retain should have window functions
that are maximum at large scales and relatively small in the region $k_{nl} < k < k_c$ (see Sec. 5).
However, since we have chosen our moments by their insensitivity to small scales, there is no 
guarantee that they will necessarily be sensitive to large scales. Indeed, we have found that
in some cases a small number of the moments found by this method turn out to be insensitive
to almost all scales. This can occur when a moment is dominated either by far away galaxies
with large errors or by a close pair of galaxies; a moment representing the difference of the
velocities of two closely spaced galaxies is sensitive only to scales which are smaller than 
the separation. Since the moments are normalized to have unit variance, the ones with low 
signal to noise can be found by examining the contribution to the variance of each moment 
from the noise part of the covariance matrix; moments with a noise contribution above some 
threshold can be discarded. 
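
The noise cut described above can be sketched as follows. Since each moment has unit variance for the fiducial parameters, the noise term $\sum_i (b_n)_i^2(\sigma_i^2 + \sigma_*^2)$ is directly the fraction of the variance due to noise. The threshold value in this sketch is purely hypothetical, as are the function names; no particular value is specified in the text.

```python
def noise_fractions(B, sigma_obs, sigma_star):
    """Noise part of each moment's variance, sum_i (b_n)_i^2 (sigma_i^2 + sigma_*^2)."""
    noise_var = [s * s + sigma_star * sigma_star for s in sigma_obs]
    return [sum(b * b * nv for b, nv in zip(row, noise_var)) for row in B]

def keep_by_noise(B, sigma_obs, sigma_star, threshold=0.9):
    """Indices of moments whose noise contribution does not exceed the threshold.

    The default threshold of 0.9 is a hypothetical placeholder.
    """
    fractions = noise_fractions(B, sigma_obs, sigma_star)
    return [n for n, f in enumerate(fractions) if f <= threshold]
```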



7. RESULTS FROM SIMULATED CATALOGS 

One concern is that the same small-scale, nonlinear effects that we are trying to remove 
can also lead to deviations from Gaussianity, which our method does not account for. While 
it is plausible that these deviations are small enough in typical velocity surveys as to not
significantly bias the results, the only way to be sure about this is to test the method on
realistic simulated catalogs.

While a more complete testing of our method will be presented in a subsequent paper, 
in this section we present some results from applying our method to simulated catalogs 
that illustrate the effects of small-scale, nonlinear power and how they are mitigated in our 
analysis. 

For our testing we have chosen simulated catalogs with $\sim 1000$ galaxies designed to
mimic the characteristics of the SFI survey (da Costa et al. 1995). The catalogs were drawn
from a $256^3$ N-body PM (particle mesh) simulation with $\Gamma = 0.25$ and $\beta = \Omega^{0.6}\sigma_8 = 0.46$.
In these simulations, the box size was taken to be 512 Mpc and the Hubble constant $h =
H/(100\ \mathrm{km\,s^{-1}\,Mpc^{-1}}) = 0.75$; thus the box size in redshift space corresponds to a diameter of
38,400 km s$^{-1}$. Galaxies were identified in these simulations and assigned physical properties.
To duplicate the characteristics of the SFI survey, galaxies were "observed" by applying the
same selection criteria. Realistic scatter was added to galaxy properties that duplicates the
15-20% relative error in the SFI inferred distances. Finally, following Freudling et al. (1995)
we applied an inhomogeneous Malmquist correction to our catalogs.

We performed the analysis described in Sec. 6 on these simulated catalogs. In Fig. 1
we show the window functions for selected moments calculated for a typical catalog in order
of increasing eigenvalue, with the top plot showing the window functions for the moments
with the five lowest eigenvalues, the middle showing five others associated with somewhat
larger eigenvalues, and the bottom plot showing five more selected from the whole range
of eigenvalues. This demonstrates that selecting moments that are least sensitive to small
scales does in fact generally result in moments that are most sensitive to large scales; window
functions of moments with larger eigenvalues are successively larger on nonlinear scales as
expected. Thus the information contained in large eigenvalue moments comes mostly from
scales where fluctuations are nonlinear and should not be included in a linear analysis.

Fig. 1. - The window functions, plotted against $k$ ($h\,\mathrm{Mpc}^{-1}$), in units of 6 ■ 10~^. The top panel shows the window
functions associated with the five smallest eigenvalues. The center panel shows the window
functions associated with five somewhat larger eigenvalues, while the bottom panel shows the
window functions selected from the entire range of eigenvalues. For each panel we show the
eigenvalue rank for each window function, as described in Sec. 4. It is clear that the window
functions corresponding to lower rank eigenvalues probe mostly large scales, whereas window
functions corresponding to large eigenvalues probe mostly nonlinear scales and thus carry
information that should not be used in an analysis based on linear theory.

For our simulated catalogs, we know the "true" values of $\Gamma$ and $\beta$. If we use these true
values as our "guess" (see Sec. 6) to calculate the optimum moments, then the values of these
moments calculated from the velocities should have unit variance, since the power spectrum
model should be an excellent fit to the data. However, nonlinear effects can cause higher
order moments to deviate from unit variance. In Fig. 2 we show the sum of the squares of the
first $N$ moments versus moment number $N$ for a typical catalog, where the moments are ranked in
order of increasing eigenvalue. Note that for small $N$, the sum tracks a line with unit slope,
whereas for large $N$ the sum deviates from this line; this is an indication that the nonlinear
effects are causing the large $N$ moments to deviate from unit variance.
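
The diagnostic behind this comparison can be sketched in Python as a running sum of squared moments compared against the unit-slope line. The departure test and its tolerance are hypothetical additions for illustration; for unit-variance Gaussian moments the running sum has expectation $N$ and scatter of order $\sqrt{2N}$, so a fixed tolerance is only a crude guide.

```python
def cumulative_square_sum(u):
    """Running sum S_N = sum_{n<=N} u_n^2 for moments ordered by increasing eigenvalue."""
    sums, total = [], 0.0
    for value in u:
        total += value * value
        sums.append(total)
    return sums

def first_departure(sums, tolerance):
    """First N (1-based) where |S_N - N| exceeds the tolerance, or None."""
    for n, s in enumerate(sums, start=1):
        if abs(s - n) > tolerance:
            return n
    return None
```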

In Fig. 3 we show the results of the likelihood analysis on a typical catalog for different
numbers $N'$ of moments kept. For reference, we also give the value of $\Delta\theta_q$ for each $N'$ as
discussed in Sec. 4. Here the closed triangles correspond to the maximum likelihood values
while the contours correspond to 1/2, 1/10, and 1/100 of the maximum likelihood. The
asterisk symbol corresponds to the input values used for the simulation, i.e. the "true"
values for $\Gamma$ and $\beta$. We see that in this case, inclusion of all of the information leads to
the location of the maximum likelihood being skewed away from the true values (see the
panel with $N' = N$). However, when higher order moments are discarded, the location of
the maximum likelihood corresponds well with the true values. For this particular catalog,
with $\sigma_* = 200$ km/s, the criterion of Eq. (21) would give $N' \sim 125$ for the optimum number
of moments to keep. The fact that the discarding of higher order moments leads to a much
better agreement between the maximum likelihood location and the true values is a good
indication that our analysis method is effectively removing small-scale, nonlinear velocity
information. 

Fig. 2. The sum of the squares of the first N moments versus moment number N, where 
the moments are ranked in order of increasing eigenvalue. Note that for small N, the sum 
tracks a line with unit slope, whereas for large N the sum deviates from this line; this is an 
indication that nonlinear effects are causing the large N moments to deviate from unit 
variance. 

Fig. 3. Likelihood vs. Γ and β for the simulated catalog as described in Sec. 7 for different 
N', the number of moments kept; we also show the value of Δθ_q, the criterion for choosing 
N' as shown in Eq. (21). The panels show the maximum likelihood value (solid triangles) 
and the contours corresponding to 0.5, 0.1 and 0.01 of the maximum values. The asterisk in 
each panel is the "true" value of Γ = 0.25 and β = 0.457 for the simulation. Increasing the 
number of modes N' improves the accuracy of the maximum likelihood values up to a point, 
but the inclusion of large eigenvalue modes that carry information mostly from nonlinear 
scales can skew the result away from the true value. 

Although the plots in Figs. 1-3 were calculated using a single catalog, we note that the 
results from other catalogs that we analyzed were not significantly different. 

8. DISCUSSION AND CONCLUSIONS 

In this paper we have presented a new method for the analysis of peculiar velocity 
surveys which removes contributions to velocities from small scale, nonlinear modes while 
retaining information about large scale motions. Our method selects a set of optimal 
moments constructed as linear combinations of velocities which are minimally sensitive to small 
scales. We have shown how the overall sensitivity of a set of moments to small scales can be 
quantified, and how to control this sensitivity through the choice of the number of moments 
to retain. 

As discussed above, the necessity of assuming Gaussian statistics in our analysis raises 
the possibility that deviations from Gaussianity caused by the collapse of perturbations will 
interfere with the removal of small scale power and introduce additional unpredictable biases. 
While the results of Sec. 7 indicate that deviations from Gaussianity are not having a large 
effect, careful testing of our analysis method using simulated catalogs will be necessary to 
prove its effectiveness at filtering small-scale power. We are currently carrying out tests 
using catalogs drawn from simulations with a variety of parameter values, the results of 
which will be presented in a subsequent paper. This work will explore in more detail such 
issues as moment selection, optimal values for constants to be used in the analysis, and 
the dependence of the results of the analysis on whether the survey objects are galaxies or 
clusters of galaxies and how these objects are selected. We will also investigate differences 
in the small scale power present in simulations produced using PM, P3M and tree codes. 
Once these tests are completed, we plan to apply our formalism to analyze existing velocity 
surveys, including the Mark III and SFI catalogs. 

One of the merits of the formalism we have presented is its versatility; it can be applied 
to a wide variety of surveys with different geometries and densities. The formalism also 
allows for each object in the survey to have an independent velocity error. Versatility will 
be especially important as new distance measurement techniques begin to produce large 
surveys that may have a variety of characteristics. Our formalism will be particularly useful 
for surveys which use clusters of galaxies as tracers of the velocity field, which are necessarily 
quite sparse. 

While we have focused on the use of our data compression formalism for the determination 
of power spectrum parameters through a likelihood analysis, it has broad applicability 
as a data filtering technique. Essentially, our formalism "rotates" the vector consisting of 
survey object velocities into a basis where the covariance matrix is diagonal with the new 
moments ranked as to their sensitivity to small scales. Discarding moments containing small 
scale information is equivalent to setting the value of these moments equal to zero; the 
vector of moments can then be rotated back to the survey object velocity basis. The result is 
essentially a "smoothed" or "linearized" velocity data set, which can be used as input for 
analysis methods that focus on large scale motions and assume linear theory. The amount 
and scale of the filtering can be adjusted by varying the constants and thresholds used in 
the construction of moments. 
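
The rotate, truncate, and rotate-back procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the construction used in the paper: the basis comes from a plain eigendecomposition of a random stand-in covariance matrix, and the ordering returned by `numpy.linalg.eigh` stands in for the ranking by sensitivity to small scales. The function name `filter_velocities` and the variable names are hypothetical.

```python
import numpy as np

def filter_velocities(velocities, covariance, n_keep):
    """Rotate to the eigenbasis, zero the discarded moments, and rotate back."""
    # Columns of eigvecs form an orthonormal basis in which covariance is diagonal.
    _, eigvecs = np.linalg.eigh(covariance)
    moments = eigvecs.T @ velocities   # rotate into the moment basis
    moments[n_keep:] = 0.0             # discard every moment after the first n_keep
    return eigvecs @ moments           # rotate back to the survey-object basis

rng = np.random.default_rng(3)
n = 50
b = rng.normal(size=(n, n))
covariance = b @ b.T / n + np.eye(n)   # hypothetical stand-in covariance
velocities = rng.normal(size=n)        # hypothetical velocity data vector

smoothed = filter_velocities(velocities, covariance, n_keep=20)
print(np.linalg.norm(velocities), np.linalg.norm(smoothed))
```

Because the basis is orthonormal, the operation is a projection: applying it twice changes nothing further, and keeping all moments returns the original velocities.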

Finally, we note that the general technique we have developed for using data compression 
to filter out unwanted information may be useful in other areas of astrophysics; for example, 
in the analysis of cosmic microwave background data. 

We wish to thank Avishai Dekel and Ami Eldar for illuminating conversations. HAF 
and ALM wish to acknowledge support from the National Science Foundation under grant 
number AST-0070702, the University of Kansas General Research Fund and the National 
Center for Supercomputing Applications. 

A. APPENDIX 

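The line-of-sight velocity $\hat{\mathbf{r}}_i \cdot \mathbf{v}(\mathbf{r}_i)$ can be written in terms of the Fourier transform of 
the velocity field 

$$
\hat{\mathbf{r}}_i \cdot \mathbf{v}(\mathbf{r}_i) = \frac{1}{(2\pi)^{3/2}} \int d^3k \, (\hat{\mathbf{r}}_i \cdot \hat{\mathbf{k}}) \, v(\mathbf{k}) \, e^{i \mathbf{k} \cdot \mathbf{r}_i} \tag{A1}
$$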



The covariance matrix then becomes 



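\begin{align}
R^{(v)}_{ij} &= \left\langle \hat{\mathbf{r}}_i \cdot \mathbf{v}(\mathbf{r}_i) \; \hat{\mathbf{r}}_j \cdot \mathbf{v}(\mathbf{r}_j) \right\rangle \tag{A2} \\
&= \frac{1}{(2\pi)^3} \int d^3k \int d^3k' \, (\hat{\mathbf{r}}_i \cdot \hat{\mathbf{k}}) (\hat{\mathbf{r}}_j \cdot \hat{\mathbf{k}}') \left\langle v(\mathbf{k}) \, v^*(\mathbf{k}') \right\rangle e^{i (\mathbf{k} \cdot \mathbf{r}_i - \mathbf{k}' \cdot \mathbf{r}_j)} \tag{A3} \\
&= \frac{1}{(2\pi)^3} \int d^3k \, P_v(k) \, (\hat{\mathbf{r}}_i \cdot \hat{\mathbf{k}}) (\hat{\mathbf{r}}_j \cdot \hat{\mathbf{k}}) \, e^{i \mathbf{k} \cdot (\mathbf{r}_i - \mathbf{r}_j)} \tag{A4} \\
&= \frac{1}{2\pi^2} \int dk \, k^2 \, P_v(k) \, W^2_{ij}(k) \tag{A5}
\end{align}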

(A6) 

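where we have used the fact that $\langle v(\mathbf{k}) \, v^*(\mathbf{k}') \rangle = P_v(k) \, \delta(\mathbf{k} - \mathbf{k}')$ and we have defined the 
tensor window function $W^2_{ij}(k)$ as the integral over the possible directions of the vector $\mathbf{k}$, 

$$
W^2_{ij}(k) = \frac{1}{4\pi} \int d\Omega_k \, (\hat{\mathbf{r}}_i \cdot \hat{\mathbf{k}}) (\hat{\mathbf{r}}_j \cdot \hat{\mathbf{k}}) \, e^{i \mathbf{k} \cdot (\mathbf{r}_i - \mathbf{r}_j)} \tag{A7}
$$
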

In linear theory, the velocity power spectrum is related to the density power spectrum 
by $P_v(k) = (H/k)^2 f^2(\Omega_0) P(k)$, with $f(\Omega_0) \approx \Omega_0^{0.6}$. This allows us to write $R^{(v)}_{ij}$ as an 
integral over the density power spectrum, 

$$
R^{(v)}_{ij} = \frac{H^2 f^2(\Omega_0)}{2\pi^2} \int P(k) \, W^2_{ij}(k) \, dk \tag{A8}
$$
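
For readers who want to evaluate the tensor window function numerically, the following Python sketch performs the angular integral by direct quadrature. It is a hedged illustration, not code from this work: it assumes the window function is normalized as an angular average (a prefactor of 1/4π), it keeps only the real (cosine) part of the exponential, since the imaginary part cancels between opposite directions of k, and the function name `window_function` and the grid sizes are hypothetical choices.

```python
import numpy as np

def window_function(k, r_i, r_j, n_mu=64, n_phi=128):
    """Angular average of (r_i_hat . k_hat)(r_j_hat . k_hat) cos(k . (r_i - r_j))."""
    r_i = np.asarray(r_i, dtype=float)
    r_j = np.asarray(r_j, dtype=float)
    rhat_i = r_i / np.linalg.norm(r_i)
    rhat_j = r_j / np.linalg.norm(r_j)

    # Gauss-Legendre nodes in mu = cos(theta); uniform (periodic) nodes in phi.
    mu, w_mu = np.polynomial.legendre.leggauss(n_mu)
    phi = 2.0 * np.pi * np.arange(n_phi) / n_phi
    w_phi = 2.0 * np.pi / n_phi
    sin_theta = np.sqrt(1.0 - mu**2)

    # Unit vectors k_hat on the quadrature grid, shape (n_mu, n_phi, 3).
    k_hat = np.stack(
        [
            sin_theta[:, None] * np.cos(phi)[None, :],
            sin_theta[:, None] * np.sin(phi)[None, :],
            mu[:, None] * np.ones(n_phi)[None, :],
        ],
        axis=-1,
    )

    phase = k * (k_hat @ (r_i - r_j))
    integrand = (k_hat @ rhat_i) * (k_hat @ rhat_j) * np.cos(phase)
    return np.sum(w_mu[:, None] * integrand) * w_phi / (4.0 * np.pi)

# Example: two hypothetical object positions and one wavenumber (arbitrary units).
w_ij = window_function(0.05, [30.0, 10.0, -5.0], [-20.0, 15.0, 40.0])
w_ii = window_function(0.05, [30.0, 10.0, -5.0], [30.0, 10.0, -5.0])
print(w_ij, w_ii)
```

A useful sanity check under this normalization is the diagonal case: for i = j the phase vanishes and the angular average of the squared direction cosine is 1/3 at every k.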